Author: Terence Date: Aug 21, 2008 22:10
I was re-reading my huge Fortran 77/90/95 language specification
manual and noted that the open file option ACCESS='BINARY' was
accepted as an extension only for Windows up to NT, but not other
platforms.
I can solve some potential problems of upgrading Fortran programs to a
more portable standard, by using DIRECT UNFORMATTED access, but I am
left with the cases where programs read somewhat unknown input data
files of words of bits, and proceed to determine the coding structure
used (e.g. IBM 12-bit binary, IBM 16-bit binary, Qantum 12-bit card
code, common 10, 12, 16 and 36 character ASCII code and so on) before
reopening the file in a more suitable way for reading that particular
structure.
To date I have used the BINARY option and read chunks of data to
search for clues (e.g. searching for CR, LF, CR-LF, LF-CR, and DEL
characters and the character interval counts between each, and the
presence or absence of the top one or two bits in each character and
whether any hex zero bytes occur).
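A minimal sketch of that kind of clue-gathering, written in Python only because it is compact. The function names `sniff_bytes` and `line_lengths` are hypothetical, and the set of clues is simply the one listed above, not a complete detector for any particular coding.

```python
from collections import Counter


def sniff_bytes(chunk: bytes) -> dict:
    """Collect simple clues about the coding of a raw chunk of bytes."""
    counts = Counter(chunk)
    return {
        "cr": counts[0x0D],                  # carriage returns
        "lf": counts[0x0A],                  # line feeds
        "crlf": chunk.count(b"\r\n"),        # CR-LF pairs
        "lfcr": chunk.count(b"\n\r"),        # LF-CR pairs
        "del": counts[0x7F],                 # DEL characters
        "nul": counts[0x00],                 # hex zero bytes
        # bytes with the top bit set, and with both top bits set
        "top_bit_set": sum(1 for b in chunk if b & 0x80),
        "top_two_bits_set": sum(1 for b in chunk if (b & 0xC0) == 0xC0),
    }


def line_lengths(chunk: bytes, terminator: bytes = b"\n") -> list:
    """Character counts between successive terminators.

    The trailing piece after the last terminator is dropped because it
    may be an incomplete line.
    """
    pieces = chunk.split(terminator)
    return [len(p) for p in pieces[:-1]]
```

Constant values from `line_lengths` would hint at fixed-length card images, while varying values would hint at ordinary text lines.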
Author: nospam Date: Aug 21, 2008 23:15
Terence wrote:
> I was re-reading my huge Fortran 77/90/95 language specification
> manual and noted that the open file option ACCESS='BINARY' was
> accepted as an extension only for Windows up to NT, but not other
> platforms.
Presumably you are talking about some specific, but unmentioned
compiler, as the actual Fortran language specifications certainly don't
say anything about Windows versions. Anyway...
> What is the simple definition of the expected structure of files
> declared as UNFORMATTED SEQUENTIAL? I had always thought these are
> expected to contain non-data markers.
The standard doesn't say. But containing no non-data markers is not a
realistic expectation (except for file systems where the record
structure is maintained out-of-band, but those aren't common these
days). The most common structure involves adding record-size fields
before (and usually after) the data of the record. The details *DO*
vary. This is essentially not an option for reading non-Fortran files;
it is even a bit problematic for reading Fortran files created by other
compilers.
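To make the record-size fields concrete, here is a rough Python sketch of one possible layout: a 4-byte little-endian length before and after each record. That layout is an assumption for illustration only, since the details vary between compilers, and the names `frame_records` and `read_framed_records` are made up.

```python
import struct


def frame_records(payloads, marker_fmt: str = "<i") -> bytes:
    """Frame each payload as [length][payload][length] (assumed layout)."""
    out = bytearray()
    for payload in payloads:
        marker = struct.pack(marker_fmt, len(payload))
        out += marker + payload + marker
    return bytes(out)


def read_framed_records(data: bytes, marker_fmt: str = "<i") -> list:
    """Split data into records, checking leading and trailing markers.

    Raises ValueError if the data does not fit the assumed layout, which
    is what one would expect for a non-Fortran file or for a file written
    with a different marker convention.
    """
    size = struct.calcsize(marker_fmt)
    records, pos = [], 0
    while pos < len(data):
        if pos + size > len(data):
            raise ValueError("truncated leading marker")
        (length,) = struct.unpack_from(marker_fmt, data, pos)
        start = pos + size
        end = start + length
        if length < 0 or end + size > len(data):
            raise ValueError("record does not fit the assumed layout")
        (trailer,) = struct.unpack_from(marker_fmt, data, end)
        if trailer != length:
            raise ValueError("leading and trailing markers disagree")
        records.append(data[start:end])
        pos = end + size
    return records
```

Passing a different `marker_fmt` (for example `">i"` or `"<q"`) is a cheap way to try other byte orders or marker widths when the first guess raises `ValueError`.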
Author: Arjen Markus Date: Aug 21, 2008 23:19
On 22 aug, 08:15, nos...@see.signature (Richard Maine) wrote:
>
> 4. Use direct access unformatted. There are lots of complications, but
> it is doable. I know, as I've done it. You have to do your own record
> management in order to give Fortran the fixed-size chunks of data that
> it wants. It would take quite a while to elaborate on all the various
> gotchas. (Yes, the last block is one of them). The fact that it can get
> pretty messy is why I pushed for access='stream', which makes it all so
> much easier.
>
Have a look at http://flibs.sf.net for that particular approach.
It is - unfortunately - not quite without problems, as Richard
also indicates, but it could be useful nonetheless.
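The general idea of doing your own record management on top of fixed-size blocks can be sketched as follows. This is Python and not code from flibs; `records_from_blocks` is a hypothetical name, and the terminator-delimited record format is assumed purely for illustration.

```python
import io


def records_from_blocks(stream, recl: int, terminator: bytes = b"\n"):
    """Yield variable-length records rebuilt from fixed-size blocks.

    The stream is read recl bytes at a time. Bytes after the last
    terminator seen so far are carried over to the next block, so a
    record (or a multi-byte terminator) may span a block boundary.
    """
    carry = b""
    while True:
        block = stream.read(recl)
        if not block:          # end of data
            break
        carry += block
        # Everything before the final piece is a complete record.
        *complete, carry = carry.split(terminator)
        yield from complete
    if carry:                  # unterminated data left in the last block
        yield carry


# Usage with an in-memory stream and a deliberately awkward block size.
demo = io.BytesIO(b"one\r\ntwo\r\nthree")
print(list(records_from_blocks(demo, recl=4, terminator=b"\r\n")))
```

Here the final `read` simply returns a short block. A Fortran direct-access READ may not be able to do that when the file size is not a multiple of the record length, which is the "last block" problem mentioned in the quoted text.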
Regards,
Arjen
Author: robert.corbett Date: Aug 21, 2008 23:25
On Aug 21, 10:10 pm, Terence wrote:
> I was re-reading my huge Fortran 77/90/95 language specification
> manual and noted that the open file option ACCESS='BINARY' was
> accepted as an extension only for Windows up to NT, but not other
> platforms.
>
> I can solve some potential problems of upgrading Fortran programs to a
> more portable standard, by using DIRECT UNFORMATTED access, but I am
> left with the cases where programs read somewhat unknown input data
> files of words of bits, and proceed to determine the coding structure
> used (e.g. IBM 12-bit binary, IBM 16-bit binary, Qantum 12-bit card
> code, common 10, 12, 16 and 36 character ASCII code and so on) before
> reopening the file in a more suitable way for reading that particular
> structure.
>
> To date I have used the BINARY option and read chunks of data to
> search for clues (e.g. searching for CR, LF, CR-LF, LF-CR, and DEL
> characters and the character interval counts between each, and the
> presence or absence of the top one or two bits in each character and
> whether any hex zero bytes occur). ...
Author: nospam Date: Aug 21, 2008 23:59
> There is no reason to expect direct-access
> unformatted files to be more portable than sequential-access
> unformatted files, even among implementations on similar
> operating systems.
While there may be no "reason", my observation shows it to be so.
By "so" I am taking your "more portable" literally in that "more
portable" is not the same thing as "100% portable." Yes, I also know of
exceptions. But they are distinctly exceptions. You can work with an
awful lot of compilers and never run into one of the exceptions. Some,
though not all, of the exceptions can be addressed by compiler switches
such as Lahey's /nohed. That's unlike sequential access unformatted, where
you find differences every time you turn around.
Since I have actually used direct access unformatted for this myself on
a large variety of systems, I'm going to be pretty adamant about claiming
that it can be done.
Author: glen herrmannsfeldt Date: Aug 22, 2008 11:06
> The reason unformatted records are called unformatted records
> is that at one time they were. Early computer systems tended
> to use record-based I/O hardware. When all I/O is physically
> record-based, the record format is simply assumed. When
> reading or punching cards, there is no need to guess what the
> record structure might be. When reading or writing unblocked
> open reel tapes, the record structure is indicated by the
> physical inter-record gaps. No data needs to be provided to
> indicate the record structure.
This is still done for many tape systems. For disks, most now
use a fixed hardware block size and buffer it in memory to
give the impression of a uniform byte stream to the user.
IBM mainframes use a record oriented I/O system for disks.
For direct access files, each record maps to a physical
disk block allocated to the appropriate size. (It may
be remapped inside the disk/controller hardware, but
the record structure is visible to the OS.)
Author: Terence Date: Aug 23, 2008 01:50
Yes, thanks to all responding.
I should have said I worked on early Fortran (post Fortran II) for IBM
from 1960 on, and know what was then "the standard" with respect to the
two FORM formats and two ACCESS methods, right through the 370 days.
However, I have found weird changes to what I thought was sacrosanct,
when working (for my next company for 28 years) all over the world on
mainframes and process control and message switching computers, and
finding count bytes stuck into UNFORMATTED SEQUENTIAL input files.
To respond to doubts as to what compiler I am referring to, I use CVF/
DVF 6.6c for Windows work, and MS F77 v3.31 for all DOS work; both of
which accept "BINARY" as a sequential access option. And Lahey accepts
"Transparent" for the same useful purpose.
Given the above comments, I am now certain I will stick with DIRECT
UNFORMATTED and process the data myself (as usual for variable length
records).
Author: Steve Lionel Date: Aug 23, 2008 12:22
Terence wrote:
> I do wish the standard default for RECL on this mode was still bytes
> (and not 4-byte words) and not a compiler option. After all the
> default (if not specified) in SEQUENTIAL, is bytes, not 4-byte words -
> just the opposite!
Since you're using CVF, the default for UNFORMATTED access is 4-byte
units of RECL=. This has been the mode of DEC compilers for more than
30 years and comes from the F77 standard's use of the term "storage
units" being interpreted as "numerical storage units" - that is, the
size of an INTEGER or REAL.
In Fortran 2003, the standard still allows this but recommends the use
of bytes (I forget the exact wording).
--
Steve Lionel
Developer Products Division
Intel Corporation
Nashua, NH
For email address, replace "invalid" with "com"
Author: Terence Date: Aug 23, 2008 15:31
Steve Lionel wrote:
> Terence wrote:
>
>> I do wish the standard default for RECL on this mode was still bytes
>> (and not 4-byte words) and not a compiler option. After all the
>> default (if not specified) in SEQUENTIAL, is bytes, not 4-byte words -
>> just the opposite!
>
> Since you're using CVF, the default for UNFORMATTED access is 4-byte
> units of RECL=. This has been the mode of DEC compilers for more than
> 30 years and comes from the F77 standard's use of the term "storage
> units" being interpreted as "numerical storage units" - that is, the
> size of an INTEGER or REAL.
>
> In Fortran 2003, the standard still allows this but recommends the use
> of bytes (I forget the exact wording).
Author: Joe Date: Aug 23, 2008 15:33
On Aug 23, 3:22 pm, Steve Lionel wrote:
> Terence wrote:
>> I do wish the standard default for RECL on this mode was still bytes
>> (and not 4-byte words) and not a compiler option. After all the
>> default (if not specified) in SEQUENTIAL, is bytes, not 4-byte words -
>> just the opposite!
>
> Since you're using CVF, the default for UNFORMATTED access is 4-byte
> units of RECL=. This has been the mode of DEC compilers for more than
> 30 years and comes from the F77 standard's use of the term "storage
> units" being interpreted as "numerical storage units" - that is, the
> size of an INTEGER or REAL.
>
> In Fortran 2003, the standard still allows this but recommends the use
> of bytes (I forget the exact wording).
>
F2003 also has ISO_FORTRAN_ENV to supply the actual file storage unit
size in bits. Hopefully, this will be commonly available in the near
future.
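As a small worked example of why the unit matters, the arithmetic can be sketched like this (Python, hypothetical helper name `recl_in_units`). The `unit_bits` argument stands in for whatever the processor reports as its file storage unit size, for instance through FILE_STORAGE_SIZE in ISO_FORTRAN_ENV.

```python
def recl_in_units(record_bytes: int, unit_bits: int) -> int:
    """RECL= value needed to hold record_bytes bytes.

    unit_bits is the size of one RECL unit in bits: 8 if the compiler
    counts bytes, 32 if it counts 4-byte numeric storage units.
    The result is rounded up so the record always fits.
    """
    unit_bytes, remainder = divmod(unit_bits, 8)
    if unit_bytes <= 0 or remainder:
        raise ValueError("unit_bits must be a positive multiple of 8")
    return -(-record_bytes // unit_bytes)   # ceiling division


print(recl_in_units(512, 8))    # byte units
print(recl_in_units(512, 32))   # 4-byte units
```

A 512-byte record therefore needs RECL=512 under one convention and RECL=128 under the other, so the same OPEN statement can describe records of different sizes under different compilers or switches.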