I implemented legacy "token pasting" via line continuation for
call prefix&
&MACRO&
&suffix(1)
in a recent patch; this patch addresses the related cases
call prefix&
&MACRO&
&(1)
and
call &
&MACRO&
&suffix(1)
Fixes the latest https://github.com/llvm/llvm-project/issues/79590.
https://github.com/llvm/llvm-project/issues/78927 contains a case of
fixed-form source in which a Hollerith literal is mistakenly tokenized,
leading to grief later due to apparently unbalanced parentheses.
The source looks like "REAL*8 R8HEAP(SCRSIZE)" and the Hollerith literal
is misrecognized as such because it follows "8R". In order to properly
tokenize Hollerith literals in old comma-free FORMAT statements like "1
FORMAT(3I5HFLANG)", the tokenizer in the prescanner treats a letter
after an integer token ("3I") as a special case. The fix is to do this
only when the characters involved are nested in parentheses and
Hollerith is a possibility.
Fixes https://github.com/llvm/llvm-project/issues/78927.
Some Fortran codes use line continuation as a form of token pasting; see
https://github.com/llvm/llvm-project/issues/78797. This works in
compilers that run a C-like preprocessor and then apply line
continuation to its output; f18 implements line continuation during
tokenization and preprocessing, but can still handle this case.
In the rare case when an identifier is split across two or more
continuation lines, this patch allows its parts to be distinct
preprocessing tokens for the purpose of macro replacemnt. They (or their
replacement texts) can be effectively rejoined later as a single
identifier when the cooked character stream is tokenized in parsing.
Fixes https://github.com/llvm/llvm-project/issues/78797.
The prescanner allows multiple spaces within a compiler directive, but
not between the directive's sentinel (e.g., !DIR$) and the directive's
first token.
Fixes https://github.com/llvm/llvm-project/issues/76537.
For a character literal that is split over more than one source line
with free form line continuation using '&'
at the end of one line but missing the standard-required '&' on the
continuation line, also handle the case of spaces at the beginning of
the continuation line.
For example,
PRINT *, 'don'&
't poke the bear'
now prints "don't poke the bear", like nearly all other Fortran
compilers do.
This is not strictly standard conforming behavior, and the compiler
emits a portability warning with -pedantic.
Fixes llvm-test-suite/Fortran/gfortran/regression/continuation_1.f90,
.../continuation_12.f90, and .../continuation_13.f90.
Before emitting a warning message, code should check that the usage in
question should be diagnosed by calling ShouldWarn(). A fair number of
sites in the code do not, and can emit portability warnings
unconditionally, which can confuse a user that hasn't asked for them
(-pedantic) and isn't terribly concerned about portability *to* other
compilers.
Add calls to ShouldWarn() or IsEnabled() around messages that need them,
and add -pedantic to tests that now require it to test their portability
messages, and add more expected message lines to those tests when
-pedantic causes other diagnostics to fire.
At present, the prescanner emits an error if a source line or compiler
directive, after macro replacement or not, contains a token with a
non-Fortran character. In the particular case of the '!' character, the
code that checks for bad character will accept the '!' if it appears
after a ';', since the '!' might begin a compiler directive.
This current implementation fails when a compiler directive appears
after some other character that might (by means of further source
processing not visible to the prescanner) be replaced with a ';' or
newline.
Extend the bad character check for '!' to actually check for a compiler
directive sentinel instead.
…fter macro expansion
When compiler directives (!$omp) and/or their continuations (!$omp &)
are produced by macro expansion, handle those continuations. Also allow
a continuation marker (&) to appear in a macro actual argument.
When the old extension of D debug lines in fixed form source is enabled,
recognize continuation lines that begin with a D.
(No test is added, since the community's driver doesn't support the GNU
Fortran option -fd-lines-as-code.)
D155499 fixed an issue with implicit continuations. The fixes included a
nested parenthesis check during definition of a macro which is then
carried over in the scanner state.
This leads to the following corner case to fail:
subroutine foo(a, d)
implicit none
integer :: a
integer :: d
! An implicit continuation won't be considered unless
! the definition of "bar" above is removed/commented
call sub(1,
2)
end subroutine foo
The definition of bar is indeed unbalanced but it is not even used in
the code, so it should not impact whether we apply implicit continuation
in the expansion of sub.
This change aims at addressing this issue by removing the balance check
and constraining a bit more when we consider implicit continuations:
only when we see a left parenthesis after a function-like macro, not a
object-like macro. In this case I think it is OK to (unconditionally)
implicitly continue to the next line in search of the corresponding
right parenthesis. This is, to my understanding, similar to what the C
preprocessor would do according to the description in [1].
[1] https://www.spinellis.gr/blog/20060626/
Differential Revision: https://reviews.llvm.org/D157414
The prescanner performs implicit line continuation when it looks
like the parenthesized arguments of a call to a function-like macro
may span multiple lines. In an attempt to work more like a
Fortran-oblivious C preprocessor, the prescanner will act as if
the following lines had been continuations so that the function-like
macro could be invoked.
This still seems like a good idea, but a recent bug report on
LLVM's GitHub issue tracker shows one way in which it could trigger
inadvertently and mess up a program. So this patch makes the
conditions for implicit line continuation much more strict.
First, the leading parenthesis has to have been preceded by an
identifier that's known to be a macro name. (It doesn't have to
be a function-like macro, since it's possible for a keyword-like
macro to expand to the name of a function-like macro.) Second,
no macro definition can ever have had unbalanced parentheses in
its replacement text.
Also cleans up some parenthesis recognition code to fix some
issues found in testing, so that a token with leading or trailing
spaces can still be recognized as a parenthesis or comma.
Fixes https://github.com/llvm/llvm-project/issues/63844.
Differential Revision: https://reviews.llvm.org/D155499
A quotation mark can appear in a Fortran character literal by doubling
it; for example, PRINT *, "'""'" prints '"'. When those doubled
quotation marks are split by a free form line continuation, the
continuation line should have an ampersand before the second quotation
mark. But most compilers, including this one, allow the second
quotation mark to appear as the first character on the continuation
line, too.
So this works:
print *, "'"&
"'"
but it really should be written as:
print *, "'"&
&"'"
Emit a portability warning and document that we support this near-universal
extension.
Differential Revision: https://reviews.llvm.org/D155973
Fix some problems with INCLUDE line recognition pointed out by some
recently-added tests to the LLVM test suite.
Differential Revision: https://reviews.llvm.org/D155497
Extend the SourceFile class to take account of #line directives
when computing source file positions for error messages.
Adjust the output of #line directives to -E output so that they
reflect any #line directives that were in the input.
Differential Revision: https://reviews.llvm.org/D153910
Begin upstreaming of CUDA Fortran support in LLVM Flang.
This first patch implements parsing for CUDA Fortran syntax,
including:
- a new LanguageFeature enum value for CUDA Fortran
- driver change to enable that feature for *.cuf and *.CUF source files
- parse tree representation of CUDA Fortran syntax
- dumping and unparsing of the parse tree
- the actual parsers for CUDA Fortran syntax
- prescanning support for !@CUF and !$CUF
- basic sanity testing via unparsing and parse tree dumps
... along with any minimized changes elsewhere to make these
work, mostly no-op cases in common::visitors instances in
semantics and lowering to allow them to compile in the face
of new types in variant<> instances in the parse tree.
Because CUDA Fortran allows the kernel launch chevron syntax
("call foo<<<blocks, threads>>>()") only on CALL statements and
not on function references, the parse tree nodes for CallStmt,
FunctionReference, and their shared Call were rearranged a bit;
this caused a fair amount of one-line changes in many files.
More patches will follow that implement CUDA Fortran in the symbol
table and name resolution, and then semantic checking.
Differential Revision: https://reviews.llvm.org/D150159
Modify the prescanner to allow compiler directives to appear in
macro expansions, and adjust the parser to accept a semicolon
as a directive terminator.
Differential Revision: https://reviews.llvm.org/D150780
f18 accepts statement labels in fixed form source even if they follow
a semicolon -- i.e., they're not in the fixed form's label field.
Emit a warning for such usage.
Differential Revision: https://reviews.llvm.org/D143817
A fixed form continuation line should not be the first line of a statement;
emit a warning if it looks like one is so that the programmer can look
for a missing card.
Differential Revision: https://reviews.llvm.org/D142763
f18 doesn't have any limit on continuation lines in fixed or free form
source (other than available memory), but the standard does. Emit
a portability warning when it is exceeded.
Differential Revision: https://reviews.llvm.org/D139055
When including debug lines as code, the `D` should be considered as
a white space. Currently an error was raised about bad labels because
it the `D` remained a `D` when considering the source line as code.
Differential Revision: https://reviews.llvm.org/D122711
Using recently established message severity codes, upgrade
non-fatal messages to usage and portability warnings as
appropriate.
Differential Revision: https://reviews.llvm.org/D121246
Source lines with mismatched parentheses are hard cases for error
recovery in parsing, and the best error message (viz.,
"here's an unmatched parenthesis") can be emitted from the
prescanner.
Differential Revision: https://reviews.llvm.org/D111254#3046173
From subclause 6.3.3.5: a program unit END statement cannot be
continued in fixed form, and other statements cannot have initial
lines that look like program unit END statements. I think this
is to avoid violating assumptions that are important to legacy
compilers' statement classification routines.
Differential Revision: https://reviews.llvm.org/D109933
Ticking off a Parser TODO: Preprocessor::Directive()'s Prescanner
argument should be a reference, not a pointer.
Differential Revision: https://reviews.llvm.org/D109094
Make the #include "file" preprocessing directive begin its
search in the same directory as the file containing the directive,
as other preprocessors and our Fortran INCLUDE statement do.
Avoid current working directory for all source files except the original.
Resolve tests.
Differential Revision: https://reviews.llvm.org/D95481
Make the #include "file" preprocessing directive begin its
search in the same directory as the file containing the directive,
as other preprocessors and our Fortran INCLUDE statement do.
Avoid current working directory for all source files after the original.
Differential Revision: https://reviews.llvm.org/D95388
Hew more closely to the C17 standard; perform macro replacement
of arguments to function-like macros unless they're being stringified
or pasted. Test with a model "assert" macro idiom that exposed
the problem.
Differential Revision: https://reviews.llvm.org/D87650
The std::string holding the content of a CookedSource no longer
needs to be exposed in its API after the recent work that allows
the parsing context to hold multiple instances of a CookedSource.
So clean the API. These changes were extracted from some work in
progress that was made easier by the API changes.
Differential Revision: https://reviews.llvm.org/D87635
These are owned by an instance of a new class AllCookedSources.
This removes the need for a Scope to own a string containing
a module's cooked source stream, and will enable errors to be
emitted when parsing module files in the future.
Differential Revision: https://reviews.llvm.org/D86891
When an illegal character appears in Fortran source (after
preprocessing), catch and report it in the prescanning phase
rather than leaving it for the parser to cope with.
Differential Revision: https://reviews.llvm.org/D86553
If the label field is empty, and macro replacement occurs,
the rescanned text might be misclassified as a comment card
if it happens to begin with a C or a D. Insert a leading
space into these otherwise empty label fields.
Fixes https://bugs.llvm.org/show_bug.cgi?id=47173
To prevent mistokenization of CHARACTER*2HXY as a Hollerith
literal constant while allowing it in DATA A/2*2HXY/, there's
a little state that tracks whether a / has been seen earlier
in the same statement. But it was being reset on each line,
not statement, so Hollerith in a DATA statement continuation
line was incorrectly tokenized. Fixed.
Differential Revision: https://reviews.llvm.org/D85571
The prescanner looks for implicit continuation lines when
there are unclosed parentheses at the end of a line, so that
source preprocessing macro references with arguments that span
lines are recognized. The condition that determines this
implicit continuation has been put into a predicate member
function and corrected to apply only when the following line
is source (not a preprocessing directive, comment, &c.).
Fixes bugzilla #46768.
Reviewed By: sscalpone
Differential Revision: https://reviews.llvm.org/D84280
In fixed form source, complain when a label digit appears
outside the label field & when a non-digit appears in the label
field.
Reviewed By: sscalpone
Differential Revision: https://reviews.llvm.org/D84283
Fixed-form line continuation was not working when the
preceding line was a bare label.
Reviewed By: tskeith
Differential Revision: https://reviews.llvm.org/D82687
The previous code had handling for cases when too many file descriptors may be
opened; this is not necessary with MemoryBuffer as the file descriptors are
closed after the mapping occurs. MemoryBuffer also internally handles the case
where a file is small and therefore an mmap is bad for performance; such files
are simply copied to memory after being opened.
Many places elsewhere in the code assume that the buffer is not empty, and the
old file opening code handles this by replacing an empty file with a buffer
containing a single newline. That behavior is now kept in the new MemoryBuffer
based code.
Original-commit: flang-compiler/f18@d34df84351
Reviewed-on: https://github.com/flang-compiler/f18/pull/1032
This patch replaces the occurrence of std::ostream by llvm::raw_ostream.
In LLVM Coding Standards[1] "All new code should use raw_ostream
instead of ostream".[1]
As a consequence, this patch also replaces the use of:
std::stringstream by llvm::raw_string_ostream or llvm::raw_ostream*
std::ofstream by llvm::raw_fd_ostream
std::endl by '\n' and flush()[2]
std::cout by llvm::outs() and
std::cerr by llvm::errs()
It also replaces std::strerro by llvm::sys::StrError** , but NOT in Fortran
runtime libraries
*std::stringstream were replaced by llvm::raw_ostream in all methods that
used std::stringstream as a parameter. Moreover, it removes the pointers to
these streams.
[1]https://llvm.org/docs/CodingStandards.html
[2]https://releases.llvm.org/2.5/docs/CodingStandards.html#ll_avoidendl
Signed-off-by: Caroline Concatto <caroline.concatto@arm.com>
Running clang-format-7
Signed-off-by: Caroline Concatto <caroline.concatto@arm.com>
Removing residue of ostream library
Signed-off-by: Caroline Concatto <caroline.concatto@arm.com>
Original-commit: flang-compiler/f18@a3507d44b8
Reviewed-on: https://github.com/flang-compiler/f18/pull/1047