Most Fortran implementations support a READ statement after a WRITE
without repositioning on a sequential unit; it implies on ENDFILE and
then hits an EOF condition.
Fixes llvm-test-suite/Fortran/gfortran/regression/backspace_2.f.
… input
The current EOF detection method (IsAtEOF()) depends on comparing the
current record number with the record number of the endfile record, if
it is known, which it is for units that have been written and then
rewound for input. For formatted input, this is wrong in the case of a
unit written with in-band newline characters. Rather than scan output
data to count newlines, it's best to just organically determine EOF by
detecting a failed or short read(), as we would have done anyway had the
endfile record number not been known. (I considered resetting the
endfile record number at the point of a REWIND, but not all rewinds are
followed by input; it seems wiser to defer the resetting until an actual
READ takes place.)
Fixes llvm-test-suite/Fortran/gfortran/regression/backslash_2.f90
After a non-advancing WRITE to a unit, ensure that any ENDFILE operation
(explicit or implicit) terminates the record. (All other Fortran
implementations do so except XLF.)
Fixes llvm-test-suite/Fortran/gfortran/regression/advance_6.f90.
A subtle bug in buffer management is being caused by a WRITE on an
unformatted file with variable-length records after one or more
BACKSPACEs and some READs. An attempt at a minor optimization in
BACKSPACE kept the footer of the previous record in the unit's buffer,
in case more BACKSPACEs were to follow. If a later WRITE takes place
instead, the buffer's frame would still cover that footer, but its
content would be lost when the buffer became dirty on its first
modification, and the footer in the file would be overwritten with stale
buffer contents. As WriteFrame() implies the intent to define all the
bytes in the identified range, the trick being used in BACKSPACE with
its adjustment to frameOffsetInFile_ breaks any later WRITE... and so we
just can't do it.
Fixes https://github.com/llvm/llvm-project/issues/72599.
…hecks
An OPEN statement on an existing unit can imply a CLOSE of that unit;
for example, OPEN(5, FILE="mydata", FORM="formatted") should implicitly
close the standard input that had been preconnected to unit 5. When this
happens, later checks in OPEN statement completion that apply only to
existing units should be disabled.
The implementation of BACKSPACE on a variable-length sequential formatted
file has a bug that prevents it from working on an empty record.
Differential Revision: https://reviews.llvm.org/D154750
When flushing output to a non-positionable tty or socket file, reset the
left tab limit. Otherwise, non-advancing output to that file will contain
an increasing amount of leading spaces in each flush. Also, detect
newline characters in stream output, and treat them as record
advancement.
Differential Revision: https://reviews.llvm.org/D148157
Rework the recursive I/O error check on I/O units so that
threads again hold a lock on a unit throughout an I/O statement.
Add an API to the runtime's Lock class implementation for pthreads
to allow detection of solf-deadlock without depending on EDEADLK
or recursive mutexes.
This should fix I/O from OpenMP threads.
Differential Revision: https://reviews.llvm.org/D139477
Currently, the FORT_CONVERT environment variable has the highest priority when
setting the endianness conversion for unformatted files. In discussing the
appropriate priority for the fconvert option, convert specifiers were decided
to take highest priority.
This patch also initializes the open statement convert state to unknown
to disambiguate cases where the convert specifier was not provided from
cases where convert=native was set. This makes it possible to defer to the
environment setting where appropriate.
Differential Revision: https://reviews.llvm.org/D133237
The runtime I/O library correctly handles endianness conversions on payload
data I/O and on the output of sequential record headers and footers, but
does not swap endianness when required when reading sequential record headers
and footers back in for READ and BACKSPACE statements. Mea culpa. Fix.
Fixes https://github.com/llvm/llvm-project/issues/57126
Differential Revision: https://reviews.llvm.org/D132168
When an I/O statement contains a function call that attempts
to perform I/O on the same unit, detect the recursive I/O
and terminate with a useful message rather than deadlocking in
the threading library.
Differential Revision: https://reviews.llvm.org/D131097
An OPEN statement that implies closing a connection must invalidate
the unit's frame buffer so as to prevent stale data from the old
connection from being read into the newly-connected unit.
Differential Revision: https://reviews.llvm.org/D130430
When the I/O runtime is truncating an external file due to an
implied ENDFILE or explicit ENDFILE, ensure that the unit's frame
buffer for the file discards any data that have become obsolete.
This bug caused trouble with ACCESS='STREAM' I/O using POS= on
a WRITE, but it may have not been limited to that scenario.
Differential Revision: https://reviews.llvm.org/D129673
An ENDFILE statement executed when a non-advancing READ has
left the unit in the middle of a record must truncate the file
at that position.
Differential Revision: https://reviews.llvm.org/D129019
First, ExternalFileUnit::SetPosition was being used both as a utility
within the class' member functions as well as an API from I/O statement
processing. Make it private, and add APIs for SetStreamPos and SetDirectRec.
Second, ensure that SetStreamPos for POS= positioning in a stream
doesn't leave the current record number and endfile record number
in an arbitrary state. In stream I/O they are used only to manage
end-of-file detection, and shouldn't produce false positive results
from IsAtEnd() after repositioning.
Differential Revision: https://reviews.llvm.org/D128388
Fortran does have negative unit numbers -- they show up in child I/O
subroutines for defined I/O and for OPEN(NEWUNIT=) -- but the runtime
needs to catch the cases where a negative unit number that wasn't
generated by the runtime is passed in for OPEN or for an I/O statement
that would ordinarily create an anonymous "fort.NNN" file for a
hitherto unseen unit.
Differential Revision: https://reviews.llvm.org/D127788
A REWIND of a unit that's in the middle of a record due to a READ
or WRITE statement with ADVANCE='NO' needs to reset the left tab
limit so that the next transfer takes place at the beginning of
the first record.
Differential Revision: https://reviews.llvm.org/D127783
When an unconnected unit number is used in a BACKSPACE statement
with ERR=, IOSTAT=, &/or IOMSG= control specifiers, don't crash,
but let the program deal with the error.
Differential Revision: https://reviews.llvm.org/D127782
After a recoverable error condition in a READ statement with ADVANCE='NO',
skip the remainder of the current record.
Differential Revision: https://reviews.llvm.org/D127433
Track pending "asynchronous" I/O operation IDs so that WAIT statements can
report errors about bad ID numbers.
Lowering will need to extended to call GetAsynchronousId() for a READ or
WRITE statement with ID=n.
Differential Revision: https://reviews.llvm.org/D127421
Diagnose OPEN(FILE=f) when f is already connected by the same name to
a distinct external I/O unit.
Differential Revision: https://reviews.llvm.org/D127035
The initial size of the file was not being captured as the file position
on which the first output buffer should be framed.
Differential Revision: https://reviews.llvm.org/D127029
An external READ(END=) that hits the end of the file must
also note the virtual position of the endfile record that
has just been discovered, so that a later BACKSPACE statement
won't end up at the wrong record.
Differential Revision: https://reviews.llvm.org/D126146
A BACKSPACE statement on a unit after a READ or WRITE with ADVANCE="NO"
must reset the position to the beginning of the record, not to the
beginning of the previous one.
Differential Revision: https://reviews.llvm.org/D125057
When the last operation on a foramtted sequential or stream file (prior
to an implied or explicit ENDFILE) is a non-advancing WRITE, ensure
that any partial record data is emitted to the file without a line
terminator. Further, when that last record is read with a non-advancing
READ, ensure that it won't raise an end-of-record condition after its
data, but instead will signal an end-of-file.
Differential Revision: https://reviews.llvm.org/D124546
When an error occurs in a formatted sequential output statement
and no output was ever emitted, don't emit a blank record.
This matches the error case behavior of other Fortran compilers.
Differential Revision: https://reviews.llvm.org/D123734
A predicate expression made ENDFILE statements significant
only for sequential files, but it's applicable to formatted
stream output as well.
Differential Revision: https://reviews.llvm.org/D123730
Correct the implementation of non-advancing I/O after some testing
to ensure that T tab edit descriptors are not allowed to back up
into positions of a record prior to where it stood at the beginning
of the I/O statement.
Differential Revision: https://reviews.llvm.org/D123709
Implements UTF-8 encoding and decoding for external units
with OPEN(ENCODING='UTF-8'). This encoding applies to default
CHARACTER values that are not 7-bit ASCII as well as to
the wide CHARACTER kinds 2 and 4. Basic testing is in place
via direct calls to the runtime I/O APIs, but serious checkout
awaits lowering support of the wide CHARACTER kinds.
Differential Revision: https://reviews.llvm.org/D122038
Some I/O error situations are current handled with fatal
runtime asserts, but should be exposed for user program
error recovery.
Differential Revision: https://reviews.llvm.org/D122049
Where possible, I added additional information to the messages to help
programmers figure out what went wrong. I also removed all uses of the word
"bad" from the messages since (to me) that implies a moral judgement rather
than a programming error. I replaced it with either "invalid" or "unsupported"
where appropriate.
Differential Revision: https://reviews.llvm.org/D121493
Rather than reading default character variables in formatted
input one byte at a time via NextInField(), skip and read
them via blocks of available buffer data. This eliminates
a bottleneck that affected reads of large character values.
(It also exposed a problem with sequential reads with RECL=
set on the OPEN statement, so that's fixed too.)
Differential Revision: https://reviews.llvm.org/D121144
A data transfer statement must have REC= in its control list
if (and only if) the unit was opened with ACCESS='DIRECT'.
The runtime wasn't catching this error, but was just silently
advancing to the next record as if the access were sequential.
Differential Revision: https://reviews.llvm.org/D120838
The runtime crashes on several fundamental I/O data transfer statement
control list errors, like list I/O on a direct-access unit, or
input from a write-only unit, &c. These errors should not be fatal
when ERR= or IOSTAT= are present.
This patch creates a new ErroneousIoStatementState class and
uses it for the state of an I/O statement that is doomed to fail
from these errors. If there is no ERR= label or IOSTAT= variable,
the error will be raised at the end of the statement. Data transfer
operations along the way will be no-op failures.
Differential Revision: https://reviews.llvm.org/D120745
Corrects the runtime implementation of I/O on files with
the access mode ACCESS='STREAM'. This is a collection
of edge-case tweaks to ensure that the distinctions between
stream and direct/sequential files, unformatted or formatted,
are respected where appropriate.
Moves NextInField() from io-stmt.h to io-stmt.cpp --
it was getting too big to keep in a header.
This patch exposed a problem with the I/O runtime
on Windows and it was reverted. This version also
fixes that problem; files are now opened on Windows
in binary mode to prevent inadvertent insertions of
carriage returns before line feeds, and those line
endings (CR+LF) are now explicitly generated.
Differential Revision: https://reviews.llvm.org/D119015
Corrects the runtime implementation of I/O on files with
the access mode ACCESS='STREAM'. This is a collection
of edge-case tweaks to ensure that the distinctions between
stream and direct/sequential files, unformatted or formatted,
are respected where appropriate.
Moves NextInField() from io-stmt.h to io-stmt.cpp --
it was getting too big to keep in a header.
Differential Revision: https://reviews.llvm.org/D118834
When RECL= is set on OPEN(), ensure that it:
1) enforces a max output record payload size
(not including header+footer or newline), and
2) causes padding of short output records only
for ACCESS='DIRECT'
The previous code was causing some false overrun errors
and applying padding to sequential/stream output files.
Differential Revision: https://reviews.llvm.org/D118630
After an ENDFILE statement, a WRITE is an error without
a prior BACKSPACE. Also fix the return value for the case
of formatted integer input with no input digits to be false
(exposed by new test).
Differential Revision: https://reviews.llvm.org/D117346
RECL= is required for direct access I/O, but is permitted
as well for sequential I/O, where it is defined by the
standard to specify a maximum record (line) length.
The standard does not say what should happen when an
sequential formatted input record appears whose length is
unequal to RECL= when it is specified.
Precedents from other compilers are unclear: one raises an error,
some honor RECL= as an effective truncation, and a few ignore the
situation. On output, all other compilers tested raised an
error when an attempt is made to emit a record longer than RECL=.
This patch treats RECL= as effective truncation on input and
as a hard limit with error on output, and also ensures that
RECL= can be set *longer* than the actual input record lengths.
Differential Revision: https://reviews.llvm.org/D115102
INQUIRE(POSITION=)'s results need to reflect the POSITION=
specifier used for the OPEN statement until the unit has been
repositioned. Preserve the POSITION= from OPEN and used it
for INQUIRE(POSITION=) until is becomes obsolete.
INQUIRE(PAD=) is implemented here in the case of an unconnected unit
with Fortran 2018 semantics; i.e., "UNDEFINED", rather than Fortran 90's
"YES"/"NO" (see 4.3.6 para 2). Apparent failures with F'90-only tests
will persist with INQUIRE(PAD=); these discrepancies don't seem to warrant
an option or environment variable.
To make the implementation of INQUIRE more closely match the language
in the standard, rename IsOpen() to IsConnected(), and use it explicitly
for the various INQUIRE specifiers.
Differential Revision: https://reviews.llvm.org/D114755
This is a near-universal language extension; external unit 0
is preconnected to the standard error output.
Differential Revision: https://reviews.llvm.org/D114298
The predefined units were not being initialized with FORM='FORMATTED',
so INQUIRE(PAD=) was failing if no I/O had already been done.
INQUIRE(POSITION=) was returning 'REWIND' on stdin/stdout (which
is somewhat defensible from the definition, and is what Intel Fortran
does), but most implementations return 'ASIS'. Change the runtime
to return 'REWIND' only for positionable external files, but 'ASIS'
for terminals, sockets, &c.
Differential Revision: https://reviews.llvm.org/D114028
1. To avoid overwriting the part of the record read in the non advancing read,
the furtherPositionInRecord field must be set to the max of the
furtherPositionInRecord and the positionInRecord at the beginning of the
IO write.
2. To allow any further read to succeed after the write, the unit
beganReadingRecord_ must be set to false when resetting the recordLength
during the write, otherwise, recordLength will not be computed in further
read and an assert is hit (at unit.cpp(398)).
The added unit test exercises both of these scenarios.
Differential Revision: https://reviews.llvm.org/D113740
Profiling a basic internal real input read benchmark shows some
hot spots in the code used to prepare input for decimal-to-binary
conversion, which is of course where the time should be spent.
The library that implements decimal to/from binary conversions has
been optimized, but not the code in the Fortran runtime that calls it,
and there are some obvious light changes worth making here.
Move some member functions from *.cpp files into the class definitions
of Descriptor and IoStatementState to enable inlining and specialization.
Make GetNextInputBytes() the new basic input API within the
runtime, replacing GetCurrentChar() -- which is rewritten in terms of
GetNextInputBytes -- so that input routines can have the
ability to acquire more than one input character at a time
and amortize overhead.
These changes speed up the time to read 1M random reals
using internal I/O from a character array from 1.29s to 0.54s
on my machine, which on par with Intel Fortran and much faster than
GNU Fortran.
Differential Revision: https://reviews.llvm.org/D113697
When processing the devious NAMELIST input
&group logarray = t t t
= 666 /
for LOGICAL::logarray(3) and INTEGER::t, the runtime library
needs to do some look-ahead on the input stream to make sure
that the last "t" on the first line is a truth value rather than
an item name -- which in this case it is. This look-ahead
was implemented in a previous patch but only worked for internal
input cases; this patch implements look-ahead capabilities for
input from an external file, too (and also adjusts repeated
list-directed input items to use this infrastructure, too).
Differential Revision: https://reviews.llvm.org/D113311