Author: John Doty Date: Oct 11, 2006 13:38
I realized that I have a little job on the table that is a fine test of
the Python versus Standard Forth code availability and reusability issue.
Note that I have little experience with either Python or Standard Forth
(but I have much experience with a very nonstandard Forth). I've noodled
around a bit with both gforth and Python, but I've never done a serious
application in either. In my heart, I'm more of a Forth fan: Python is a
bit too much of a black box for my taste. But in the end, I have work to
get done.
The problem:
I have a bunch of image files in FITS format. For each raster row in
each file, I need to determine the median pixel value and subtract it
from all of the pixels in that row, and then write out the results as
new FITS files.
This is a real problem I need to solve, not a made-up toy problem. I was
originally thinking of solving it in C (I know where to get the pieces
in that language), but it seemed like a good test problem for the Python
versus Forth issue.
I looked to import FITS reading/writing, array manipulation, and median
determination. From there, the solution should be pretty easy.
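A minimal sketch of the row-wise step in Python, assuming the image has already been read into a list of rows. FITS input and output are left out, `subtract_row_medians` is a hypothetical name, and `statistics.median` from the modern standard library stands in for whichever median routine is actually used:

```python
from statistics import median


def subtract_row_medians(image):
    """Return a new image with each row's median subtracted from that row."""
    result = []
    for row in image:
        # Exact median; for an even-length row this averages the two middle values.
        m = median(row)
        result.append([p - m for p in row])
    return result


# Illustrative data only: two raster rows of made-up pixel values.
image = [
    [10, 12, 11, 50],
    [3, 3, 4, 5, 100],
]
print(subtract_row_medians(image))
```

The input rows are left unmodified, so the original pixel data is still available when the new file is written.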
Author: Paul Rubin Date: Oct 11, 2006 13:53
John Doty writes:
> I have a bunch of image files in FITS format. For each raster row in
> each file, I need to determine the median pixel value and subtract it
> from all of the pixels in that row, and then write out the results as
> new FITS files.
I dunno what FITS is, but if you have a list of pixel values, that
calculation sounds like two lines:
median = sorted(pixels)[len(pixels)//2]
new_pixels = [p-median for p in pixels]
Author: Coos Haak Date: Oct 11, 2006 13:54
On Wed, 11 Oct 2006 14:38:51 -0600, John Doty wrote:
> OK, now for Forth. Googling for "forth dup swap median" easily found:
>
> http://www.taygeta.com/fsl/library/find.seq
>
> At first blush, this looked really good for Forth. The search zeroed in
> on just what I wanted, no extras. The algorithm is better than the one
> in the Python stats module: it gives exact results, so there's no need
> to check that an approximation is good enough. But then, the
> disappointment came.
>
> What do you do with this file? It documents the words it depends on, but
> not where to get them. I'm looking at a significant effort to assemble
> the pieces here, an effort I didn't suffer through with Python. So, my
> first question was: "Is it worth it?".
Author: John Doty Date: Oct 11, 2006 14:08
Paul Rubin wrote:
> John Doty writes:
>> I have a bunch of image files in FITS format. For each raster row in
>> each file, I need to determine the median pixel value and subtract it
>> from all of the pixels in that row, and then write out the results as
>> new FITS files.
>
> I dunno what FITS is, but if you have a list of pixel values, that
> calculation sounds like two lines:
>
> median = sorted(pixels)[len(pixels)//2]
> new_pixels = [p-median for p in pixels]
Yes. The efficient exact algorithms for this problem use *partial*
sorts. The Forth one from the FSL is of this class (although I know of
two better ones for big arrays). But it's tough to beat the efficiency
of the approximate histogram-based method the Python stats module
implements.
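To illustrate what a partial-sort median looks like, here is a rough quickselect sketch in Python. It is only an example of that class of algorithm, not the FSL routine and not the stats module's histogram method; the names `select_kth` and `median_by_selection` are made up for this sketch. It runs in linear time on average because each pass discards the part of the data that cannot contain the answer:

```python
import random


def select_kth(values, k):
    """Return the k-th smallest element (0-based) without fully sorting."""
    items = list(values)  # work on a copy; the caller's data is untouched
    if not 0 <= k < len(items):
        raise IndexError("k out of range")
    while True:
        pivot = random.choice(items)
        lower = [x for x in items if x < pivot]
        equal_count = sum(1 for x in items if x == pivot)
        if k < len(lower):
            # The answer lies among the elements smaller than the pivot.
            items = lower
        elif k < len(lower) + equal_count:
            # The answer is the pivot itself.
            return pivot
        else:
            # The answer lies among the larger elements; shift k accordingly.
            k -= len(lower) + equal_count
            items = [x for x in items if x > pivot]


def median_by_selection(values):
    """Exact median: middle element, or the mean of the two middle elements."""
    n = len(values)
    if n == 0:
        raise ValueError("median of an empty sequence")
    upper = select_kth(values, n // 2)
    if n % 2:
        return upper
    lower = select_kth(values, n // 2 - 1)
    return (lower + upper) / 2.0


print(median_by_selection([5, 1, 4, 2, 3]))
print(median_by_selection([4, 1, 3, 2]))
```

For short rows a plain `sorted()` call is likely just as fast in practice; selection only pays off as the rows get long.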
Author: bearophileHUGS Date: Oct 11, 2006 14:58
John Doty:
> Yes. The efficient exact algorithms for this problem use *partial*
> sorts. The Forth one from the FSL is of this class (although I know of
> two better ones for big arrays). But it's tough to beat the efficiency
> of the approximate histogram-based method the Python stats module
> implements.
The usual way to compute a true median with Python may be:
def median(inlist):
    newlist = sorted(inlist)
    index = len(newlist) // 2
    if len(newlist) % 2:
        return newlist[index]
    else:
        return (newlist[index] + newlist[index-1]) / 2.0
Author: jacko Date: Oct 11, 2006 15:22
> John Doty:
>> Yes. The efficient exact algorithms for this problem use *partial*
>> sorts. The Forth one from the FSL is of this class (although I know of
>> two better ones for big arrays). But it's tough to beat the efficiency
>> of the approximate histogram-based method the Python stats module
>> implements.
>
> The usual way to compute a true median with Python may be:
>
> def median(inlist):
>     newlist = sorted(inlist)
>     index = len(newlist) // 2
>     if len(newlist) % 2:
>         return newlist[index]
>     else:
>         return (newlist[index] + newlist[index-1]) / 2.0
>
> If you can use Psyco and your FITS lines are really long (well, maybe
> too much, the threshold is about >~3000 on my PC) you can use something ...
Author: John Doty Date: Oct 11, 2006 15:34
Coos Haak wrote:
> On Wed, 11 Oct 2006 14:38:51 -0600, John Doty wrote:
>
>
>> OK, now for Forth. Googling for "forth dup swap median" easily found:
>>
>> http://www.taygeta.com/fsl/library/find.seq
>>
>> At first blush, this looked really good for Forth. The search zeroed in
>> on just what I wanted, no extras. The algorithm is better than the one
>> in the Python stats module: it gives exact results, so there's no need
>> to check that an approximation is good enough. But then, the
>> disappointment came.
>>
>> What do you do with this file? It documents the words it depends on, but
>> not where to get them. I'm looking at a significant effort to assemble
>> the pieces here, an effort I didn't suffer through with Python. So, my
>> first question was: "Is it worth it?".
>
> I haven't used the FSL at all, but trimming the URL and looking I found ...
Author: idknow Date: Oct 11, 2006 19:25
> John Doty:
>> Yes. The efficient exact algorithms for this problem use *partial*
>> sorts. The Forth one from the FSL is of this class (although I know of
>> two better ones for big arrays). But it's tough to beat the efficiency
>> of the approximate histogram-based method the Python stats module
>> implements.
>
> The usual way to compute a true median with Python may be:
>
> def median(inlist):
>     newlist = sorted(inlist)
>     index = len(newlist) // 2
>     if len(newlist) % 2:
>         return newlist[index]
>     else:
>         return (newlist[index] + newlist[index-1]) / 2.0
>
[snip]
Author: Stephen J. Bevan Date: Oct 11, 2006 20:49
John Doty writes:
> I realized that I have a little job on the table that is a fine test
> of the Python versus Standard Forth code availability and reusability
> issue.
[snip]
> The answer came from searching for FITS support in Forth. If it exists
> in public, it must be really well hidden. That's a "show stopper", so
> there was no point in pursuing the Forth approach further.
Did you try looking for FITS support in languages other than Python?
I tried TCL, Ruby, Scheme and Smalltalk. I could find something for
TCL but not the other three. It seems that Forth is not alone in
failing the FITS test.
> But I think this little experiment shows that for the rest of us,
> Python has a published base of reusable code that puts Forth to shame.
Surely you didn't need to do the little experiment to come to that
conclusion? The more users there are of a language, the more chance there
is that one of them will have already written some code to do what you
want and made it available.
Author: Don Seglio Date: Oct 11, 2006 21:33
Stephen J. Bevan wrote:
> John Doty writes:
>> I realized that I have a little job on the table that is a fine test
>> of the Python versus Standard Forth code availability and reusability
>> issue.
> [snip]
>> The answer came from searching for FITS support in Forth. If it exists
>> in public, it must be really well hidden. That's a "show stopper", so
>> there was no point in pursuing the Forth approach further.
>
> Did you try looking for FITS support in languages other than Python?
> I tried TCL, Ruby, Scheme and Smalltalk. I could find something for
> TCL but not the other three. It seems that Forth is not alone in
> failing the FITS test.
>
>
>> But I think this little experiment shows that for the rest of us,
>> Python has a published base of reusable code that puts Forth to shame.
>
> Surely you didn't need to do the little experiment to come to that ...