This is an archive of the discontinued LLVM Phabricator instance.

Differential D128262

[Fortran] Avoid digits in character constant
Abandoned · Public

Authored by rovka on Jun 21 2022, 5:20 AM.

Details

Reviewers

awarzynski
kiranchandramohan
Meinersbur
sscalpone
naromero77

Summary

Test 9 from FM905.f tests the splitting of long character strings at the
80-character boundary when using list-directed output. There are
2 issues that make it difficult to keep this test in the test-suite in
its current form:

1. One of the character constants contains only digits, which makes
   fpcmp interpret it as a very large number rather than a character
   constant. This is only really a problem because of issue #2:
2. The standard is not strict about the number of spaces that can
   crop up in list-directed output, which means that different compilers
   will reach column 80 at different points in the character constant, and
   therefore fpcmp will either see one very large number or 2 smaller but
   somewhat arbitrary numbers, depending on where the newline was inserted.

This patch changes the problematic character constant to use letters
instead of digits. Since whitespace is ignored when comparing against the
reference output, this fixes the issue.
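
The effect described in the summary can be sketched in a few lines of Python. This is only an illustration: `NUMBER` and `numeric_tokens` are hypothetical names, and the regular expression is a rough stand-in for "anything that looks like a number", not fpcmp's actual parser.

```python
import re

# Hypothetical stand-in for a number-aware diff tool. NOT fpcmp's real parser.
NUMBER = re.compile(r"[+-]?\d+(?:\.\d*)?(?:[eEdD][+-]?\d+)?")

def numeric_tokens(text):
    """Return every substring of `text` that looks like a number."""
    return NUMBER.findall(text)

digits = "1234567890" * 7 + "12"   # the 72-digit constant from test 9
prefix = "SHORTTHIS IS A LONGER CHARACTER STRING"

unwrapped = prefix + digits                            # no newline inserted
wrapped = prefix + digits[:40] + "\n" + digits[40:]    # newline inside the constant

print(len(numeric_tokens(unwrapped)))  # one 72-digit "number"
print(len(numeric_tokens(wrapped)))    # two shorter "numbers"
```

With letters instead of digits there are no numeric tokens at all, so the position of the newline no longer changes what a number-aware comparison sees.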

Event Timeline

rovka created this revision. Jun 21 2022, 5:20 AM

rovka requested review of this revision. Jun 21 2022, 5:20 AM

How does fpcmp fit into this issue? Is that in the checker?

In D128262#3599409, @sscalpone wrote:

How does fpcmp fit into this issue? Is that in the checker?

Yes, fpcmp is the tool we use for comparing the results with the reference output. It treats all the numbers it encounters as floats.
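
As a rough illustration of what "treating numbers as floats" means for this constant, the sketch below uses Python's `float` (an IEEE double) as a stand-in. It does not reproduce fpcmp's parsing or its tolerance handling; `altered` is a made-up variant used only for the comparison.

```python
digits = "1234567890" * 7 + "12"   # the 72-digit constant from test 9
altered = digits[:-1] + "3"        # same constant with the last digit changed

# As text, the two constants differ.
print(digits == altered)                # False

# As IEEE doubles, a difference of 1 is far below the available precision.
print(float(digits) == float(altered))  # True

# If a line break splits the constant, a numeric reader sees two other values.
print(float(digits[:40]), float(digits[40:]))
```

In other words, a float-based comparison checks the constant less strictly than a string comparison would, and it is sensitive to where the record is split.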

rovka added a parent revision: D128260: [Fortran] Ignore whitespace in FCVS test results. Jun 22 2022, 4:28 AM

Hey Diana, cheers for looking into this!

Please bear with me, as this is not my area of expertise :)

Test 9 from FM905.f tests the splitting of long character strings at the 80-character boundary when using list-directed output.

Where is this boundary imposed? Is it a limit for the input or for the output? To get a better idea, I've run this benchmark using flang-new and gfortran:

flang-new

                COMPUTED=
SHORTTHIS IS A LONGER CHARACTER STRING1234567890123456789012345678901234567890
12345678901234567890123456789012
                CORRECT=
SHORT  THIS IS A LONGER CHARACTER STRING 123456789012345678901234567890123456789
012345678901234567890123456789012

gfortran

                COMPUTED=
SHORTTHIS IS A LONGER CHARACTER STRING123456789012345678901234567890123456789012345678901234567890123456789012
                CORRECT=
SHORT  THIS IS A LONGER CHARACTER STRING 123456789012345678901234567890123456789
012345678901234567890123456789012

So the expected output was generated with gfortran in mind. But why is the output from flang-new different? You referred to white spaces in list-directed output, but the formatting seems identical with respect to white spaces.

In D128262#3604728, @awarzynski wrote:

Hey Diana, cheers for looking into this!

Please bear with me, as this is not my area of expertise :)

Not mine either :D

You're right, it's not about whitespace. I don't remember why I thought one of them was splitting at the 0 and one at the 9, I must've accidentally looked at the value printed after "CORRECT=". I guess the difference is only whether or not it splits at all. From what I managed to google up, gfortran never introduces a newline, but ifort does unless told otherwise on the command line. As you can see, flang also introduces a newline by default. To be honest, I don't understand the standardese around list-directed output very well, and since both behaviours exist in the wild I think we should try to accept both. WDYT? (I'll try to update the summary accordingly if you agree with the approach)

In D128262#3604875, @rovka wrote:

From what I managed to google up, gfortran never introduces a newline, but ifort does unless told otherwise on the command line. As you can see, flang also introduces a newline by default.

So, we know that the format of the output in this case is compiler-specific and the reference output was generated using gfortran. Generalizing it so that it works for other compilers makes sense to me.

think we should try to accept both. WDYT?

How can we achieve this? Wouldn't that require a dedicated "reference" file for every compiler that we want to support here? Also, by replacing digits with letters you are basically making fpcmp ignore this particular bit of output, right? So:

why not delete this particular test line if it's to be ignored anyway?
why is the generated output only really verified with fpcmp?

I've scanned llvm-test-suite and I don't see any obvious way to use any other DIFFPROGR. So it sounds like we can only use fpcmp for now and this tool shouldn't be comparing strings, should it? Perhaps switching from SingleSource CMake logic in the test suite to e.g. TestFile would help, but that's IMO outside the scope of this patch.

You can get list-directed output records to wrap at any length. The default is 79 but it can be overridden via FORT_FMT_RECL=n in the environment.
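
The record length mentioned above can be related to the flang-new output quoted earlier with a small sketch. It assumes the record starts with a single blank and is simply cut every `recl` characters; `wrap_record` and `recl` are hypothetical names, and the real runtime may choose break points differently.

```python
def wrap_record(record, recl=79):
    """Split one output record into chunks of at most `recl` characters."""
    return [record[i:i + recl] for i in range(0, len(record), recl)]

digits = "1234567890" * 7 + "12"   # the 72-digit constant from test 9

# Assumption: the list-directed record begins with one blank.
record = " " + "SHORT" + "THIS IS A LONGER CHARACTER STRING" + digits

for line in wrap_record(record):        # default length of 79
    print(line)

for line in wrap_record(record, 200):   # a larger length avoids the split
    print(line)
```

Under these assumptions the default length of 79 splits the constant after its 40th digit, which matches the flang-new output shown earlier in the thread, while a sufficiently large length yields a single line as with gfortran.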

In D128262#3605337, @awarzynski wrote:

In D128262#3604875, @rovka wrote:

From what I managed to google up, gfortran never introduces a newline, but ifort does unless told otherwise on the command line. As you can see, flang also introduces a newline by default.

So, we know that the format of the output in this case is compiler-specific and the reference output was generated using gfortran. Generalizing it so that it works for other compilers makes sense to me.

think we should try to accept both. WDYT?

How can we achieve this? Wouldn't that require a dedicated "reference" file for every compiler that we want to support here? Also, by replacing digits with letters you are basically making fpcmp ignore this particular bit of output, right? So:

why not delete this particular test line if it's to be ignored anyway?

why is the generated output only really verified with fpcmp?

No, it's not ignored, fpcmp handles strings correctly (it errors out if there are any differences). In this case, it only ignores the whitespaces because I'm telling it to, but it still checks that the non-space characters are there.
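
A minimal sketch of the comparison described here, assuming whitespace is dropped entirely before the two outputs are compared. The helper names are hypothetical and this is not how fpcmp is implemented; it only shows that the non-space characters are still checked.

```python
def strip_whitespace(text):
    """Remove all whitespace, including newlines, from `text`."""
    return "".join(text.split())

def same_ignoring_whitespace(actual, expected):
    """Compare two outputs while ignoring where spaces and newlines fall."""
    return strip_whitespace(actual) == strip_whitespace(expected)

letters = "abcdefghij" * 7 + "ab"   # the replacement constant from this patch
prefix = "SHORTTHIS IS A LONGER CHARACTER STRING"

unwrapped = prefix + letters
wrapped = prefix + letters[:40] + "\n" + letters[40:]

print(same_ignoring_whitespace(wrapped, unwrapped))              # True
print(same_ignoring_whitespace(wrapped, unwrapped[:-1] + "X"))   # False
```

The second call shows that a wrong character is still reported as a difference, even though the position of the newline is not.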

I've scanned llvm-test-suite and I don't see any obvious way to use any other DIFFPROGR. So it sounds like we can only use fpcmp for now and this tool shouldn't be comparing strings, should it? Perhaps switching from SingleSource CMake logic in the test suite to e.g. TestFile would help, but that's IMO outside the scope of this patch.

Yeah, definitely outside the scope of this patch :)

In D128262#3605429, @klausler wrote:

You can get list-directed output records to wrap at any length. The default is 79 but it can be overridden via FORT_FMT_RECL=n in the environment.

That's cool, thanks :)
OTOH, that's specific to flang, right? So we'd have to tell people to add that to their environment when running the test-suite with flang. With other compilers, people will have to find alternative solutions on their own, if needed. With this patch, things just work, for any compiler, and we're not really changing the meaning of the test, since it says it's checking character constants (we're just switching to characters that don't confuse our existing diff tool).

In D128262#3605337, @awarzynski wrote:

I've scanned llvm-test-suite and I don't see any obvious way to use any other DIFFPROGR. So it sounds like we can only use fpcmp for now and this tool shouldn't be comparing strings, should it? Perhaps switching from SingleSource CMake logic in the test suite to e.g. TestFile would help, but that's IMO outside the scope of this patch.

SingleMultiSource is a skeleton for the simplest of test cases (the program consists of a single source file, and stdout/stderr is compared to a reference file).
I've integrated validators other than fpcmp (https://github.com/llvm/llvm-test-suite/blob/8e4703af93b04d72e225215e67a89973e84e59fd/External/SPEC/SpecCPU2017.cmake#L258) using llvm_test_verify. fpcmp has its issues as well, in that it ALWAYS (even with 0 tolerance) parses anything that looks like a floating-point number (as you noticed). This has been brought up before but nobody has changed it yet. I think that would be the cleaner (and overdue) solution.

@rovka
Since I changed fpcmp before (basically I rewrote the entire comparison algorithm once), I could help with that.

Ok, feel free to have a stab at it :) I'm going to be on vacation until the 20th of July, so I'll also consider the other patch on hold until you fix this one (since it might need at least a rebase afterwards). Thanks for looking into it!

I just saw that the reference output also contains floating-point numbers (13.6, 12.4, 15.25). Maybe we do indeed need to process the file in floating-point mode?

Meinersbur mentioned this in D129017: [fpcmp] Use non-floating point parsing by default. Jul 1 2022, 2:54 PM

Meinersbur mentioned this in rT6703097ffa34: [fpcmp] Use non-floating point parsing by default. Jul 18 2022, 2:30 PM

This isn't needed anymore, thanks for having a look!

Revision Contents

Path                                                   Size
Fortran/UnitTests/fcvs21_f95/FM905.f                   8 lines
Fortran/UnitTests/fcvs21_f95/FM905.reference_output    6 lines
Fortran/UnitTests/fcvs21_f95/README                    2 lines

Diff 438650

Fortran/UnitTests/fcvs21_f95/FM905.f

  CT008* TEST 8 - CHARACTER CONSTANT CONTAINING EMBEDDED ' 02070905
  WRITE (NUVI, 70081) 02150905
  70081 FORMAT(" ",6X,"5 O'CLOCK") 02160905
  CT009* TEST 9 - CHARACTER CONSTANT SPILLING OVER RECORD BOUNDARY 02170905
  IVTNUM = 9 02180905
  WRITE (NUVI, 80004) IVTNUM 02190905
  WRITE (NUVI, 80020) 02200905
  A5VK = 'SHORT' 02210905
  A33VK = 'THIS IS A LONGER CHARACTER STRING' 02220905
- A82VK = '123456789012345678901234567890123456789012345678901234502230905
- 167890123456789012' 02240905
+ A82VK = 'abcdefghijabcdefghijabcdefghijabcdefghijabcdefghijabcde02230905
+ 1fghijabcdefghijab' 02240905
  WRITE(NUVI, *) A5VK, A33VK, A82VK 02250905
  IVINSP = IVINSP + 1 02260905
  WRITE (NUVI, 80022) 02270905
  WRITE (NUVI, 70091) 02280905
  70091 FORMAT(" ", "SHORT THIS IS A LONGER CHARACTER STRING" , 02290905
- 1 " 123456789012345678901234567890123456789" / 02300905
- 2 " ","012345678901234567890123456789012" ) 02310905
+ 1 " abcdefghijabcdefghijabcdefghijabcdefghi" / 02300905
+ 2 " ","jabcdefghijabcdefghijabcdefghijab" ) 02310905
  CT010* TEST 10 - SEVERAL IDENTICAL VALUES 02320905
  IVTNUM = 10 02330905
  WRITE (NUVI, 80004) IVTNUM 02340905
  WRITE (NUVI, 80020) 02350905
  IVI = 5 02360905
  JVI = 5 02370905
  KVI = 5 02380905
  LVI = 5 02390905

Fortran/UnitTests/fcvs21_f95/FM905.reference_output

  7 INSPECT
  -3 15.25 HELLO T
  8 INSPECT
  COMPUTED=
  5 O'CLOCK
  CORRECT=
  5 O'CLOCK
  9 INSPECT
  COMPUTED=
- SHORTTHIS IS A LONGER CHARACTER STRING123456789012345678901234567890123456789012345678901234567890123456789012
+ SHORTTHIS IS A LONGER CHARACTER STRINGabcdefghijabcdefghijabcdefghijabcdefghijabcdefghijabcdefghijabcdefghijab
  CORRECT=
- SHORT THIS IS A LONGER CHARACTER STRING 123456789012345678901234567890123456789
- 012345678901234567890123456789012
+ SHORT THIS IS A LONGER CHARACTER STRING abcdefghijabcdefghijabcdefghijabcdefghi
+ jabcdefghijabcdefghijabcdefghijab
  10 INSPECT
  COMPUTED=
  5 5 5 5 5
  CORRECT=
  5 5 5 5 5 OR 5*5

  -------------------------------------------------------------------------------

Fortran/UnitTests/fcvs21_f95/README

  Acknowledgment:
  The present version has been slighly altered in the following way:
  - a non standard conforming FORMAT statement has been fixed in FM110.f.
  - Hollerith strings in FORMAT statements have been converted to quoted
  strings to conform to the Fortran 95 standard.

  Modifications:
+ 2022 June by Diana Picus
+ - replaced digits with letters in test 9 from FM905.f
  June 10 by Nichols A. Romero
  - modified driver_run input and output files to make it easier to update LLVM Test-Suite
  - remove a number of problematic tests, see CMakeLists.txt
  June 11 by Nichols A. Romero
  - adjust I0? logical unit (I06,I08,etc.) for many tests to avoid race conditions when
  running in parallel
  - rename CSEQ, DIRFILE, CDIR for many tests to avoid race conditions when running in
  parallel
  June 12 by Nichols A. Romero
  - remove `driver_parse` script since it is not needed or used.
  June 23 by Nichols A. Romero
  - Added comments regarding the use of the `driver_run` script.