Talk:Least-squares spectral analysis/Archive 2

For new discussion topics, please make a new section at the bottom.

Archived
Now that User:Geoeg is blocked indefinitely, it should be possible to focus on technical details and a neutral description of the technique and its history. Anyone who is interested in the acrimonious discussion we've had for the last month can check the archive. Dicklyon 15:18, 7 November 2007 (UTC)


 * Thank goodness. That was hard to watch, and no doubt many times harder to be involved in.  --Bob K 19:08, 7 November 2007 (UTC)


 * There is an RfC tag in the archive. Please remove it if comments are no longer required. Labongo 13:12, 16 November 2007 (UTC)


 * Done. Dicklyon 15:28, 16 November 2007 (UTC)

Laundry list section removed
I took out this section that was a list of fragments. I'll leave them here as a reminder that each item should be checked to see what can be said about it, and some sentences should be added where appropriate. If features of a method are to be added, we need to clarify which method they apply to.

Main features

 * Processing any datasets, equidistant or incomplete, regardless of record length


 * Rigorous testing of the statistical null hypothesis


 * Straightforward significance level regime


 * Weighting of the data on a per point basis rather than on a time interval basis


 * Accurate simultaneous detection of field relative dynamics and eigenfrequencies


 * Describes fields uniquely and relatively thanks to output’s linear background noise


 * Removal of unwanted frequencies from a record during the processing


 * Removal of periodic noise from a time series with minimal distortion of the spectrum of the remaining series


 * Setting up of spectral resolution at will


 * Outputs spectra in percentage variance (var%) or decibels (dB)

Korenberg's Orthogonal Search Method
A colleague at work tried it, and this is his comment:
 * "my PC crashed both times I tried to run the program, and the second time hosed my windows profile such that helpdesk had to be called."

--Bob K (talk) 20:16, 6 February 2008 (UTC)


 * That's why I usually object when people link executable content; can't be trusted. Anyway, I tried talk at that IP editor's talk page (maybe he lost it by changing IP?), and I emailed Korenberg to ask for more info so we can represent his work properly, and I did some more reading and searching and found another variant, Chen & Donoho's "basis pursuit" method.  So looks like this article has some room to grow still, with newer and/or rediscovered variants; we just have to understand the literature better first.  Dicklyon (talk) 01:45, 7 February 2008 (UTC)
 * I got a note back from Mike Korenberg with some papers and explanations of his method. I'll definitely work on a writeup of it, probably using some of what I took out before.  Also the Chen & Donoho, I think.  Both have advantages. Dicklyon (talk) 05:20, 7 February 2008 (UTC)

The Fast Chi-squared Method
I (David Palmer) have just had a paper accepted for publication in the Astrophysical Journal on a new technique, the Fast Chi-squared Method. The preprint is available on the arXiv, and GPL'd source code is available on my website. It is a fast technique (FFT-based) for doing weighted least-squares analysis (i.e. Chi-squared) on arbitrarily-spaced data with non-uniform standard errors, to fit a periodicity with an arbitrary number of harmonics (i.e. phase functions that are not simple sinusoids).

Would it be appropriate for me to add a sentence or two on the technique to this page? WP:NOR doesn't apply because I would cite my peer-reviewed paper. DMPalmer (talk) 01:00, 23 January 2009 (UTC)


 * Congrats, and thanks for asking. May I recommend you write something and post it here on the talk page?  Then we can review it and I'll add it to the article if it looks good, and you can avoid any appearance of conflict of interest. Dicklyon (talk) 06:25, 23 January 2009 (UTC)


 * After the Chen and Donoho section, a new section 'Palmer's "Fast Chi-squared" method':
 * David Palmer, of Los Alamos National Laboratory, developed a method for finding the best-fit function to any chosen number of harmonics, allowing more freedom to find non-sinusoidal harmonic functions. This method is a fast technique (FFT-based) for doing weighted least-squares analysis (i.e. Chi-squared) on arbitrarily-spaced data with non-uniform standard errors.  Source code that implements this technique is available here, licensed under the GPL.

OK, I added it, but I left out the first name and affiliation from the paragraph, so it's more like Chen and Donoho's. If you get to be more famous than David Donoho, we can say more about you :).  In the meantime, help me improve this article in a couple of ways: Thanks. Dicklyon (talk) 01:17, 26 January 2009 (UTC)
 * 1. since you're an expert, can you help clarify the contributions of the various authors? I've done my best, but I'm out of my depth.
 * 2. help improve your chi-squared link, which goes to an article that doesn't mention chi-squared.

Sawadiy wrote "If the data were not sampled at uniformly spaced discrete times, they are “gridded” (e.g., by interpolation, or by nearest neighbor sampling) to estimate what the data values at those times would have been." This is a quote from my paper, but it refers to FFT techniques 'as typically implemented' by other algorithms. I should have made it clearer that this was in contrast to what my algorithm does. My algorithm takes a long time series (actually a pair of them: one for the measurements, one for the reciprocals of their variances), and only populates it at the points where there are actual measurements. The remaining points are left unpopulated, which is equivalent to giving them infinite error bars (0 = reciprocal of variance for those points). The FFTs on the data series and the inverse variance series allow rigorous propagation of errors, so the gaps in the data are treated as intervals of true agnosticism.

A minor detail is that the times of samples are modified slightly to correspond to the nearest grid point. The effect of this is handled in two ways: First the grid points are finely spaced (densely enough that the highest harmonic of the highest frequency is below the Nyquist frequency). Second, there is a refinement step where a frequency interval surrounding the FFT peak (which is quantized by the FFT grid) is examined in more detail for the true peak, and this refinement uses the ungridded sample times.
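To make the bookkeeping concrete, here is a toy NumPy sketch of the gridding step described above (my own illustration, not Palmer's reference code; all names and parameters are invented): each measurement and its inverse variance are dropped into the nearest bin of a fine uniform grid, every other bin stays at zero weight (infinite error bars), and both series are FFT'd.

```python
import numpy as np

# Toy unevenly sampled record: a 0.9 Hz sinusoid with per-point errors.
rng = np.random.default_rng(1)
t = np.sort(rng.uniform(0.0, 50.0, 200))        # irregular sample times
y = np.sin(2.0 * np.pi * 0.9 * t) + 0.1 * rng.standard_normal(200)
sigma = np.full(200, 0.1)                       # per-point standard errors

dt = 0.01                                       # fine grid spacing, well above any Nyquist need
n = int(np.ceil(t.max() / dt)) + 1
idx = np.rint(t / dt).astype(int)               # nearest grid point per sample

wy = np.zeros(n)                                # weighted-data series
w = np.zeros(n)                                 # inverse-variance series
np.add.at(wy, idx, y / sigma**2)                # populate only occupied bins;
np.add.at(w, idx, 1.0 / sigma**2)               # empty bins keep zero weight

# FFTs of both series give the frequency-domain sums a weighted
# least-squares fit needs; the gaps contribute nothing.
WY = np.fft.rfft(wy)
W = np.fft.rfft(w)
fgrid = np.fft.rfftfreq(n, d=dt)
peak = fgrid[np.argmax(np.abs(WY)[1:]) + 1]     # ignore the DC bin
```

With this seed, the strongest non-DC component of the weighted-data spectrum lands at the injected 0.9 Hz, despite most grid bins being empty.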

I apologize for misunderstanding. I read your paper very briefly. I need something (Lomb-Scargle only?) that can deal with gaps of a few years and a 3-hour sample interval under normal conditions, so I decided to add a brief comment. --Sawadiy (talk) 20:48, 26 June 2009 (UTC)


 * No problem. Give the code a try.  The algorithm should have no particular trouble with your dataset's timing.  If your signal is strictly sinusoidal you can set the number of harmonics to 1 (fundamental only) and get something better than Lomb-Scargle.  If the signal has higher harmonics, then it is a clear win.  (Although your Edit Summary did mention pulsars, in which case you have to worry about Pdot (which adds a dimension to the search space that is not directly supported by my reference code) and barycentering.)


 * This assumes that the periodicity maintains coherence across the gaps: if it doesn't then you should analyze each segment individually (still using this algorithm) and look for periods that show up in multiple segments. There is a diagnostic flag that dumps chi-squared values across the entire spectrum instead of just the best fit, and you can set the other parameters so that you get the same set of frequencies. DMPalmer (talk) 04:47, 27 June 2009 (UTC)
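The segment-by-segment approach suggested above can be sketched with a hypothetical helper (the function name and threshold are my own, not part of the Fast Chi-squared code) that splits a record wherever the spacing between samples exceeds a chosen gap:

```python
import numpy as np

def split_at_gaps(t, y, max_gap):
    """Split an unevenly sampled record (t, y) into segments wherever
    consecutive sample times are more than max_gap apart."""
    breaks = np.where(np.diff(t) > max_gap)[0] + 1
    return list(zip(np.split(t, breaks), np.split(y, breaks)))

# A record with a years-long gap in the middle (times in days):
t = np.array([0.0, 1.0, 2.0, 3.0, 900.0, 901.0, 902.0])
y = np.arange(7.0)
segments = split_at_gaps(t, y, max_gap=10.0)
print(len(segments))  # → 2
```

Each `(t_seg, y_seg)` pair can then be analyzed on its own, and periods compared across segments.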

Because I am editing a section on my own algorithm, someone should look at this edit and make sure I am not violating Wiki policy. I did what I consider the minimum editing to make the entry correct without reverting Sawadiy's edit entirely. DMPalmer (talk) 15:10, 26 June 2009 (UTC)

Typo in book for Lomb-Scargle matlab implementation?
The link points to a Google Books page with a typo in the code: $$\tau$$ is calculated as an arctangent only. It should be divided by f4pi(fi) (i.e. 4πf, in the code's variables). --Sawadiy (talk) 15:28, 23 June 2009 (UTC)
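For reference, in the standard Lomb-Scargle formulation the arctangent is divided by 2ω = 4πf to get τ. A minimal NumPy sketch (my own variable names, not the book's Matlab listing) with the division in place:

```python
import numpy as np

def lomb_scargle(t, y, freqs):
    """Classic Lomb-Scargle periodogram for unevenly sampled data."""
    y = y - y.mean()
    power = np.empty(len(freqs))
    for i, f in enumerate(freqs):
        w = 2.0 * np.pi * f
        # The time offset tau is the arctangent divided by 2*w (= 4*pi*f);
        # leaving out the division is the typo discussed above.
        tau = np.arctan2(np.sum(np.sin(2.0 * w * t)),
                         np.sum(np.cos(2.0 * w * t))) / (2.0 * w)
        c = np.cos(w * (t - tau))
        s = np.sin(w * (t - tau))
        power[i] = 0.5 * ((y @ c) ** 2 / (c @ c) + (y @ s) ** 2 / (s @ s))
    return power

# Unevenly sampled pure sinusoid at 0.37 Hz:
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 100.0, 300))
y = np.sin(2.0 * np.pi * 0.37 * t)
freqs = np.linspace(0.01, 1.0, 1000)
peak = freqs[np.argmax(lomb_scargle(t, y, freqs))]
```

With the division by 2ω included, the periodogram peaks at the injected frequency; without it, τ is wrong at every frequency and the peak is degraded.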

Necessity of smoothing and tapering?
Should the article clarify smoothing and windowing? From what I understand, tapering is not necessary for LS methods. Meanwhile, the cts package for R (which has typos in its implementation) does tapering by default, whereas the authors here http://benfeylab.wikispaces.com/file/view/Lomb-Scargle+periodogram.pdf perform neither tapering nor smoothing. It is not clear. --Sawadiy (talk) 20:48, 26 June 2009 (UTC)