In order to properly expose the Hollerith editing item in something like
FORMAT(3I9HHOLLERITH) as its own token, the tokenization routine in the
prescanner has special handling for digit strings followed by letters
("3I" above). This handler's effects are too broad, and prevent a macro
name from being recognized as such in a reported bug; make the test for
a hidden Hollerith more precise.
Fixes https://github.com/llvm/llvm-project/issues/121931.
…d Fortran free-form
OpenMP feature was not enabled in the flang-new for the continuation
line, when we used the continuation line marker in combination of
free-form and OpenMP directive, it was throwing an error. PR is the fix
for that issue.
Added a fix for the following issue
https://github.com/llvm/llvm-project/issues/89559
A free form source line that begins with a macro should still be
classified as a source line, and have its continuation lines work, even
if the macro expands to an empty replacement.
Fixes https://github.com/llvm/llvm-project/issues/117297.
The character 'e' or 'd' (either case) shouldn't be tokenized as part of
a real literal during preprocessing if it is not followed by an
optionally-signed digit string. Doing so prevents it from being
recognized as a macro name, or as the start of one.
Fixes https://github.com/llvm/llvm-project/issues/115676.
When a fixed form source line begins with the name of a macro, don't
emit the usual warning message about a non-decimal character in the
label field. (The check for a macro was only being applied to free form
source lines, and the label field checking was unconditional).
Fixes https://github.com/llvm/llvm-project/issues/116914.
When running fixed-form source through the compiler under -E, don't
aggressively remove space characters, since the parser won't be parsing
the result and some tools might need to see the spaces in the -E
preprocessed output.
Fixes https://github.com/llvm/llvm-project/issues/112279.
(This is a big patch, but it's nearly an NFC. No test results have
changed and all Fortran tests in the LLVM test suites work as expected.)
Allow a parser::Message for a warning to be marked with the
common::LanguageFeature or common::UsageWarning that controls it. This
will allow a later patch to add hooks whereby a driver will be able to
decorate warning messages with the names of its options that enable each
particular warning, and to add hooks whereby a driver can map those
enumerators by name to command-line options that enable/disable the
language feature and enable/disable the messages.
The default settings in the constructor for LanguageFeatureControl were
moved from its header file into its C++ source file.
Hooks for a driver to use to map the name of a feature or warning to its
enumerator were also added.
To simplify the tagging of warnings with their corresponding language
feature or usage warning, to ensure that they are properly controlled by
ShouldWarn(), and to ensure that warnings never issue at code sites in
module files, two new Warn() member function templates were added to
SemanticsContext and other contextual frameworks. Warn() can't be used
before source locations can be mapped to scopes, but the bulk of
existing code blocks testing ShouldWarn() and FindModuleFile() before
calling Say() were convertible into calls to Warn(). The ones that were
not convertible were extended with explicit calls to
Message::set_languageFeature() and set_usageWarning().
Fortran INCLUDE lines have (until now) been treated like #include
directives. This isn't how things work with other Fortran compilers when
running under the -E option for preprocessing only, so stop doing it by
default, and add -fpreprocess-include-lines to turn it back on when
desired.
As specified in the docs,
1) raw_string_ostream is always unbuffered and
2) the underlying buffer may be used directly
( 65b13610a5 for further reference )
Avoid unneeded calls to raw_string_ostream::str(), to avoid excess indirection.
…INCLUDE lines
The compiler current treats an INCLUDE line as essentially a synonym for
a preprocessing #include directive. The causes macros that have been
defined at the point where the INCLUDE line is processed to be replaced
within the text of the included file.
This behavior is surprising to users who expect an INCLUDE line to be
expanded into its contents *after* preprocessing has been applied to the
original source file, with no further macro expansion.
Change INCLUDE line processing to use a fresh instance of Preprocessor
containing no macro definitions except _CUDA in CUDA Fortran
compilations and, if the original file was being preprocessed, the
standard definitions of __FILE__, __LINE__, and so forth.
Codes using traditional C preprocessors will sometimes put a keyword
macro name in a free form continuation line in order to get macro
replacement of part of an identifier, as in
call subr_&
&N&
&(1.)
where N is a keyword macro. f18 already handles this case, but not when
there is white space between the macro name and the following
continuation marker character '&'. Allow white space to appear.
Fixes https://github.com/llvm/llvm-project/issues/106931.
Accept non-breaking space characters (Latin-1 '\xa0', UTF-8 '\xc2'
'\xa0') in source code, converting them into regular spaces in the
cooked character stream when not in character literals.
…tion
See new test. A #line (or #) directive after a line ending with & and
before its continuation shouldn't elicit an error about mismatched
parentheses.
Fixes https://github.com/llvm/llvm-project/issues/100073.
The prescanner checks lines that begin with a keyword macro name to see
whether they should be treated as a comment or compiler directive
instead of a source line. This fails when the potential keyword macro
name is extended with identifier characters via Fortran line
continuation. Disable line continuation during this check.
Don't treat !DIR$ or an active !$ACC, !$OMP, &c. as a comment when they
appear after a semicolon, but instead treat them as a compiler directive
sentinel.
…literals
To help port codes from compilers using pre-ANSI C preprocessors, which
didn't care much about context when replacing macros, support the
replacement of keyword macros in quoted character literals when (and
only when) the name of the keyword macro constitutes the entire
significant portion of a free form continuation line. See the new test
case for a motivating example.
Fixes https://github.com/llvm/llvm-project/issues/96781.
…ne continuation
Allow preprocessing directives to appear between a source line and its
continuation, including conditional compilation directives (#if, #ifdef,
&c.).
Fixes https://github.com/llvm/llvm-project/issues/95476.
…rective
Implement fixed-form line continuation when the continuation line is the
result of text produced by an #include or other preprocessing directive.
This accommodates the somewhat common practice of putting dummy or
actual arguments into a header file and #including it into several code
sites.
Fixes https://github.com/llvm/llvm-project/issues/78928.
…e produced by keyword macro replacement
When later initial keyword macro replacement will yield a line that is
not Fortran source, don't interpret "&" as a Fortran source line
continuation marker during tokenization of the line.
Fixes https://github.com/llvm/llvm-project/issues/82579.
Some applications like to use a CPP-style #include directive to pull in
a common list of arguments, dummy arguments, or COMMON block variables
after a free-form & line continuation marker. This works naturally with
compilers that run an actual cpp pass over the input before doing
anything specific to Fortran, but it's a case that I missed with this
integrated preprocessor.
…Warn()
Many warning messages were being emitted unconditionally. Ensure that
all warnings are conditional on a true result from a call to
common::LanguageFeatureControl::ShouldWarn() so that it is easy for a
driver to disable them all, or, in the future, to provide per-warning
control over them.
When prescanning a Fortran source file with preprocessing enabled in
free source form, interpret a line-ending backslash as a source line
continuation marker as a C preprocessor would. This usage isn't
completely portable, but it is supported by GNU Fortran and appears in
the source for FPM package manager.
I implemented legacy "token pasting" via line continuation for
call prefix&
&MACRO&
&suffix(1)
in a recent patch; this patch addresses the related cases
call prefix&
&MACRO&
&(1)
and
call &
&MACRO&
&suffix(1)
Fixes the latest https://github.com/llvm/llvm-project/issues/79590.
https://github.com/llvm/llvm-project/issues/78927 contains a case of
fixed-form source in which a Hollerith literal is mistakenly tokenized,
leading to grief later due to apparently unbalanced parentheses.
The source looks like "REAL*8 R8HEAP(SCRSIZE)" and the Hollerith literal
is misrecognized as such because it follows "8R". In order to properly
tokenize Hollerith literals in old comma-free FORMAT statements like "1
FORMAT(3I5HFLANG)", the tokenizer in the prescanner treats a letter
after an integer token ("3I") as a special case. The fix is to do this
only when the characters involved are nested in parentheses and
Hollerith is a possibility.
Fixes https://github.com/llvm/llvm-project/issues/78927.
Some Fortran codes use line continuation as a form of token pasting; see
https://github.com/llvm/llvm-project/issues/78797. This works in
compilers that run a C-like preprocessor and then apply line
continuation to its output; f18 implements line continuation during
tokenization and preprocessing, but can still handle this case.
In the rare case when an identifier is split across two or more
continuation lines, this patch allows its parts to be distinct
preprocessing tokens for the purpose of macro replacemnt. They (or their
replacement texts) can be effectively rejoined later as a single
identifier when the cooked character stream is tokenized in parsing.
Fixes https://github.com/llvm/llvm-project/issues/78797.
The prescanner allows multiple spaces within a compiler directive, but
not between the directive's sentinel (e.g., !DIR$) and the directive's
first token.
Fixes https://github.com/llvm/llvm-project/issues/76537.
For a character literal that is split over more than one source line
with free form line continuation using '&'
at the end of one line but missing the standard-required '&' on the
continuation line, also handle the case of spaces at the beginning of
the continuation line.
For example,
PRINT *, 'don'&
't poke the bear'
now prints "don't poke the bear", like nearly all other Fortran
compilers do.
This is not strictly standard conforming behavior, and the compiler
emits a portability warning with -pedantic.
Fixes llvm-test-suite/Fortran/gfortran/regression/continuation_1.f90,
.../continuation_12.f90, and .../continuation_13.f90.
Before emitting a warning message, code should check that the usage in
question should be diagnosed by calling ShouldWarn(). A fair number of
sites in the code do not, and can emit portability warnings
unconditionally, which can confuse a user that hasn't asked for them
(-pedantic) and isn't terribly concerned about portability *to* other
compilers.
Add calls to ShouldWarn() or IsEnabled() around messages that need them,
and add -pedantic to tests that now require it to test their portability
messages, and add more expected message lines to those tests when
-pedantic causes other diagnostics to fire.
At present, the prescanner emits an error if a source line or compiler
directive, after macro replacement or not, contains a token with a
non-Fortran character. In the particular case of the '!' character, the
code that checks for bad character will accept the '!' if it appears
after a ';', since the '!' might begin a compiler directive.
This current implementation fails when a compiler directive appears
after some other character that might (by means of further source
processing not visible to the prescanner) be replaced with a ';' or
newline.
Extend the bad character check for '!' to actually check for a compiler
directive sentinel instead.
…fter macro expansion
When compiler directives (!$omp) and/or their continuations (!$omp &)
are produced by macro expansion, handle those continuations. Also allow
a continuation marker (&) to appear in a macro actual argument.
When the old extension of D debug lines in fixed form source is enabled,
recognize continuation lines that begin with a D.
(No test is added, since the community's driver doesn't support the GNU
Fortran option -fd-lines-as-code.)
D155499 fixed an issue with implicit continuations. The fixes included a
nested parenthesis check during definition of a macro which is then
carried over in the scanner state.
This leads to the following corner case to fail:
subroutine foo(a, d)
implicit none
integer :: a
integer :: d
! An implicit continuation won't be considered unless
! the definition of "bar" above is removed/commented
call sub(1,
2)
end subroutine foo
The definition of bar is indeed unbalanced but it is not even used in
the code, so it should not impact whether we apply implicit continuation
in the expansion of sub.
This change aims at addressing this issue by removing the balance check
and constraining a bit more when we consider implicit continuations:
only when we see a left parenthesis after a function-like macro, not a
object-like macro. In this case I think it is OK to (unconditionally)
implicitly continue to the next line in search of the corresponding
right parenthesis. This is, to my understanding, similar to what the C
preprocessor would do according to the description in [1].
[1] https://www.spinellis.gr/blog/20060626/
Differential Revision: https://reviews.llvm.org/D157414
The prescanner performs implicit line continuation when it looks
like the parenthesized arguments of a call to a function-like macro
may span multiple lines. In an attempt to work more like a
Fortran-oblivious C preprocessor, the prescanner will act as if
the following lines had been continuations so that the function-like
macro could be invoked.
This still seems like a good idea, but a recent bug report on
LLVM's GitHub issue tracker shows one way in which it could trigger
inadvertently and mess up a program. So this patch makes the
conditions for implicit line continuation much more strict.
First, the leading parenthesis has to have been preceded by an
identifier that's known to be a macro name. (It doesn't have to
be a function-like macro, since it's possible for a keyword-like
macro to expand to the name of a function-like macro.) Second,
no macro definition can ever have had unbalanced parentheses in
its replacement text.
Also cleans up some parenthesis recognition code to fix some
issues found in testing, so that a token with leading or trailing
spaces can still be recognized as a parenthesis or comma.
Fixes https://github.com/llvm/llvm-project/issues/63844.
Differential Revision: https://reviews.llvm.org/D155499
A quotation mark can appear in a Fortran character literal by doubling
it; for example, PRINT *, "'""'" prints '"'. When those doubled
quotation marks are split by a free form line continuation, the
continuation line should have an ampersand before the second quotation
mark. But most compilers, including this one, allow the second
quotation mark to appear as the first character on the continuation
line, too.
So this works:
print *, "'"&
"'"
but it really should be written as:
print *, "'"&
&"'"
Emit a portability warning and document that we support this near-universal
extension.
Differential Revision: https://reviews.llvm.org/D155973
Fix some problems with INCLUDE line recognition pointed out by some
recently-added tests to the LLVM test suite.
Differential Revision: https://reviews.llvm.org/D155497
Extend the SourceFile class to take account of #line directives
when computing source file positions for error messages.
Adjust the output of #line directives to -E output so that they
reflect any #line directives that were in the input.
Differential Revision: https://reviews.llvm.org/D153910
Begin upstreaming of CUDA Fortran support in LLVM Flang.
This first patch implements parsing for CUDA Fortran syntax,
including:
- a new LanguageFeature enum value for CUDA Fortran
- driver change to enable that feature for *.cuf and *.CUF source files
- parse tree representation of CUDA Fortran syntax
- dumping and unparsing of the parse tree
- the actual parsers for CUDA Fortran syntax
- prescanning support for !@CUF and !$CUF
- basic sanity testing via unparsing and parse tree dumps
... along with any minimized changes elsewhere to make these
work, mostly no-op cases in common::visitors instances in
semantics and lowering to allow them to compile in the face
of new types in variant<> instances in the parse tree.
Because CUDA Fortran allows the kernel launch chevron syntax
("call foo<<<blocks, threads>>>()") only on CALL statements and
not on function references, the parse tree nodes for CallStmt,
FunctionReference, and their shared Call were rearranged a bit;
this caused a fair amount of one-line changes in many files.
More patches will follow that implement CUDA Fortran in the symbol
table and name resolution, and then semantic checking.
Differential Revision: https://reviews.llvm.org/D150159
Modify the prescanner to allow compiler directives to appear in
macro expansions, and adjust the parser to accept a semicolon
as a directive terminator.
Differential Revision: https://reviews.llvm.org/D150780
f18 accepts statement labels in fixed form source even if they follow
a semicolon -- i.e., they're not in the fixed form's label field.
Emit a warning for such usage.
Differential Revision: https://reviews.llvm.org/D143817
A fixed form continuation line should not be the first line of a statement;
emit a warning if it looks like one is so that the programmer can look
for a missing card.
Differential Revision: https://reviews.llvm.org/D142763
f18 doesn't have any limit on continuation lines in fixed or free form
source (other than available memory), but the standard does. Emit
a portability warning when it is exceeded.
Differential Revision: https://reviews.llvm.org/D139055
When including debug lines as code, the `D` should be considered as
a white space. Currently an error was raised about bad labels because
it the `D` remained a `D` when considering the source line as code.
Differential Revision: https://reviews.llvm.org/D122711